
Cache computation of JSON nodes to improve performance #995

Open · wants to merge 14 commits into base: main

Conversation

@hudson-ai (Collaborator)

WIP. Seeing a 4x speedup on a large nested schema after adding caching, but we can probably squeeze more performance out of this. Not sure if the frozendict dependency is strictly necessary; just trying out a quick idea.

@codecov-commenter commented Aug 20, 2024

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.56%. Comparing base (6eb08f4) to head (84bd1e0).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

❗ There is a different number of reports uploaded between BASE (6eb08f4) and HEAD (84bd1e0): HEAD has 57 fewer uploads than BASE.

Flag        BASE (6eb08f4)    HEAD (84bd1e0)
                 124                67
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #995      +/-   ##
==========================================
- Coverage   70.25%   61.56%   -8.70%     
==========================================
  Files          62       62              
  Lines        4472     4483      +11     
==========================================
- Hits         3142     2760     -382     
- Misses       1330     1723     +393     


```diff
@@ -92,12 +92,12 @@ def validate_json_node_keys(node: Mapping[str, Any]):
     )


-@guidance(stateless=True)
+@guidance(stateless=True, cache=True)
```
Collaborator

If adding cache here gives a speed up, will this result in every generated integer being the same?

Collaborator Author

I'll certainly double check before marking this as non-WIP, but I believe it's just caching the return value of this function, which is a GrammarFunction, not the actual string value. In other words, I believe the answer to your question is no

Collaborator

Related question, assuming that the return value is OK: does this get cross-referenced properly when the grammar is generated? Looking at some of the serialisation code, I expect that it does, but would be good to make sure that it doesn't get expanded inline multiple times.

Collaborator Author

cache=True will ensure that subsequent calls with the same arguments will always return the exact same object (i.e. id(o1) == id(o2)).

The serialization code maintains its own cache for handling "pointers" (just integer indices) to objects in the serialized grammar, as you can see here. GrammarFunctions are hashable and just hash to the id of the object, so this code should correctly respect the object caching we do in the decorator.
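
For anyone skimming, here is a minimal sketch of the behavior being described; `cache_by_args`, `gen_number_node`, and the pointer table are illustrative stand-ins, not the actual guidance/llguidance internals.

```python
# Minimal sketch of the caching semantics described above; the decorator and
# helper names here are illustrative, not the actual guidance internals.
from functools import wraps

def cache_by_args(fn):
    """Memoize by call arguments so repeated calls return the *same* object."""
    memo = {}

    @wraps(fn)
    def wrapper(*args):
        if args not in memo:          # assumes hashable arguments
            memo[args] = fn(*args)
        return memo[args]

    return wrapper

@cache_by_args
def gen_number_node():
    # Stands in for a GrammarFunction-building helper: what gets cached is the
    # grammar node itself, not any string the model later generates from it.
    return object()

n1, n2 = gen_number_node(), gen_number_node()
assert n1 is n2  # id(n1) == id(n2), as described above

# A serializer whose pointer table is keyed on object identity then emits the
# shared node once and references it by index everywhere else it appears.
pointer_table = {}
for node in (n1, n2):
    pointer_table.setdefault(id(node), len(pointer_table))
assert pointer_table == {id(n1): 0}
```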

@hudson-ai (Collaborator Author)

@riedgar-ms tests/unit/library/test_json.py::TestOneOf::test_oneOf_compound[True] is currently failing because the warning only happens on the first call (the second just hits the cache). Any thoughts?

hudson-ai added a commit that referenced this pull request Sep 6, 2024
…ike container object (#1007)

Essentially reopening #638 with some major simplifications allowed by
the rust re-write (the serialized grammar now directly supports
"references").

Reopening because:
#995 still takes far too long to process the large JSON schema.
Line-profiling revealed that ~80% of the time spent constructing the
`GrammarFunction` is due to `replace_grammar_node`. In particular, this
function has to traverse the entire grammar tree, and we potentially
have to call it many times if there is a lot of mutual recursion.

Simply adding a node type that acts as a container (which we can fill
after we decide what its contents should be) side-steps this problem
entirely.
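
A rough sketch of the container-node idea, under the assumption of a much-simplified node model; `Placeholder`, `Terminal`, and `resolve` are hypothetical names, not the types introduced in #1007.

```python
# Hypothetical illustration of a fill-in-later container node for recursive
# grammars; the real node types in guidance/llguidance differ.
from dataclasses import dataclass
from typing import List, Optional, Union

@dataclass
class Terminal:
    value: str

@dataclass
class Placeholder:
    # Created up front so other nodes can reference it immediately; filled in
    # once the recursive definition is finished, so no replace_grammar_node-style
    # traversal of the whole tree is needed afterwards.
    target: Optional["Node"] = None

    def resolve(self, node: "Node") -> None:
        self.target = node

Node = Union[Terminal, Placeholder, List["Node"]]

# Usage: build a self-referential "array of values" rule without rewriting the
# tree after the fact.
json_value = Placeholder()
json_array: Node = [Terminal("["), json_value, Terminal("]")]
json_value.resolve(json_array)  # the array is itself a valid value
```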
@hudson-ai changed the title from "[WIP] Cache computation of JSON nodes to improve performance" to "Cache computation of JSON nodes to improve performance" on Sep 6, 2024
@hudson-ai (Collaborator Author) commented Sep 6, 2024

An open question: is top/module-level caching a good idea? I could add a layer of encapsulation such that all of the underscore prefixed functions only maintain a cache within a single call to json... Thoughts?

Also, feeling pretty okay about the frozendict requirement. It helps a lot with making everything nicely hashable without too much fuss.
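
For anyone wondering why frozendict matters here, a small sketch; the `freeze` helper and cache are illustrative, only `frozendict` itself is the real dependency.

```python
# Why frozendict helps: JSON-schema fragments are nested dicts, which are not
# hashable and so cannot be used directly as cache keys. Freezing them fixes that.
from frozendict import frozendict

def freeze(value):
    """Recursively convert dicts/lists into hashable equivalents (sketch)."""
    if isinstance(value, dict):
        return frozendict({k: freeze(v) for k, v in value.items()})
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value

schema = {"type": "object", "properties": {"x": {"type": "integer"}}}
cache = {freeze(schema): "<grammar node>"}  # works: frozendict is hashable
assert freeze(schema) in cache              # structurally equal schemas hit the cache
```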

@riedgar-ms (Collaborator)

> An open question: is top/module-level caching a good idea? I could add a layer of encapsulation such that all of the underscore prefixed functions only maintain a cache within a single call to json... Thoughts?
>
> Also, feeling pretty okay about the frozendict requirement. It helps a lot with making everything nicely hashable without too much fuss.

You might want to cross check with Paul, since he's been doing the async/multithreaded stuff.

@hudson-ai (Collaborator Author)

@paulbkoch would love your input on the caching business here :)

@hudson-ai (Collaborator Author)

Currently, the time to actually construct the grammar doesn't seem to be much affected by this PR (I think thanks to #1007, which provided a significant boost on its own).

However, caching is still needed -- otherwise we hit a limit on the number of symbols that llguidance currently allows in a grammar. Raising that limit is a non-fix, as it has implications for the performance of mask computation.

@paulbkoch (Collaborator)

> An open question: is top/module-level caching a good idea? I could add a layer of encapsulation such that all of the underscore prefixed functions only maintain a cache within a single call to json... Thoughts?

The one scenario I could think of where per-json call caching might be superior is in the case of a long running workload with constantly changing JSON schemas that are unique per call. In that case module level caching will keep consuming more and more memory. Given the JSON schemas and grammars are pretty compact, it would probably take a long time to exhaust memory though, so it's probably not the most important thing to solve. In terms of async, I can't think of an issue this caching would create, at least not after your other PR on fixing threaded grammars @hudson-ai.

The rest of the PR looked fine to me. I can't say that I understood all the JSON schema nuances here, but the application of caching to the existing code looked good to me.

@hudson-ai (Collaborator Author)

> The one scenario I could think of where per-json call caching might be superior is in the case of a long running workload with constantly changing JSON schemas that are unique per call. In that case module level caching will keep consuming more and more memory. Given the JSON schemas and grammars are pretty compact, it would probably take a long time to exhaust memory though, so it's probably not the most important thing to solve. In terms of async, I can't think of an issue this caching would create, at least not after your other PR on fixing threaded grammars @hudson-ai.

Agreed. If I can get my other PR on method decoration together, it would make it really easy to cache on a per-schema basis (one schema = one instance of a GenJSON class, methods are cached at the instance level...).
If there are no objections, it would be great to merge this PR (and per-schema caching could go in a future one).

Just want to reiterate that this introduces a new dependency in frozendict
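
To make the per-schema alternative concrete, here is a rough sketch of instance-level caching; `GenJSON` follows the naming in the comment above but is hypothetical, not code from this PR.

```python
# Hypothetical per-schema caching: one GenJSON instance per schema, so the
# cache's lifetime is tied to the schema rather than to the module.
class GenJSON:
    def __init__(self, schema):
        self.schema = schema
        self._cache = {}  # keyed by frozen sub-schema (see the freeze() sketch above)

    def gen_node(self, node_key):
        if node_key not in self._cache:
            self._cache[node_key] = self._build_node(node_key)  # the expensive part
        return self._cache[node_key]

    def _build_node(self, node_key):
        return f"<grammar for {node_key!r}>"  # stand-in for real grammar construction

# Each schema gets its own instance and therefore its own cache, which is freed
# with the instance -- addressing the unbounded-memory concern for long-running
# workloads with many distinct schemas.
gen_a = GenJSON({"type": "integer"})
gen_b = GenJSON({"type": "string"})
assert gen_a.gen_node("root") is gen_a.gen_node("root")
```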
